ABSTRACT. This paper discusses a functional approach to the problem of comparison
of multi-samples (two samples or $c$ samples, where $c \ge 2$). The data consists of $c$
random samples whose probability distributions are to be tested for equality. A diversity
of statistics to test equality of $c$ samples is presented in a unified framework with the
aim of helping the researcher choose the optimal procedures which provide greatest insight
about how the samples differ in their distributions. Concepts discussed are: sample
distribution functions; ranks; mid-distribution function; two-sample t test and nonparametric
Wilcoxon test; multi-sample analysis of variance and Kruskal-Wallis test; Anderson-Darling
and Cramer-von Mises tests; components and linear rank statistics; comparison distribution
and comparison density functions, especially for discrete distributions; components
with orthogonal polynomial score functions; chi-square tests and their components.

1. INTRODUCTION. We assume that we are observing a variable $Y$ in $c$ cases or samples
(corresponding to $c$ treatments or $c$ populations). The samples can be regarded as the
values of $c$ variables $Y_1, \ldots, Y_c$ with respective true distribution functions
$F_1(y), \ldots, F_c(y)$ and quantile functions $Q_1(u), \ldots, Q_c(u)$. We call
$Y_1, \ldots, Y_c$ the conditioned variables (the value of $Y$ in different populations).

The general problem of comparison of conditioned random variables is to model how
their distribution functions vary with the value of the conditioning variable $k = 1, \ldots, c$,
and in particular to test the hypothesis of homogeneity of distributions:

$$H_0: F_1 = \cdots = F_c = F.$$

The distribution $F$ to which all the others are equal is considered to be the unconditional
distribution of $Y$ (which is estimated by the sample distribution of $Y$ in the pooled sample).

2. DATA. The data consists of $c$ random samples

$$Y_k(j), \quad j = 1, \ldots, n_k,$$

for $k = 1, \ldots, c$. The pooled sample, of size $N = n_1 + \cdots + n_c$, represents observations of
the pooled (or unconditional) variable $Y$. The $c$ samples are assumed to be independent
of each other.

3. SAMPLE DISTRIBUTION FUNCTIONS. The sample distribution functions of
the samples are defined (for $-\infty < y < \infty$) by

$$\tilde{F}_k(y) = \text{fraction} \le y \text{ among } Y_k(\cdot).$$

The unconditional or pooled sample distribution of $Y$ is denoted

$$\tilde{F}(y) = \text{fraction} \le y \text{ among } Y_k(\cdot), \quad k = 1, \ldots, c.$$

We use a hat (as in $\hat{F}$) to denote a smoother distribution to which we are comparing a
more raw distribution, which is denoted by a tilde (as in $\tilde{F}$). An expectation (mean)
computed from a sample is denoted $\tilde{E}$.
4. RANKS, MID-RANKS, AND MID-DISTRIBUTION FUNCTION. Nonparametric
statistics use ranks of the observations in the pooled sample; let

$R_k(t)$ denote the rank in the pooled sample of $Y_k(t)$.

One can define $R_k(t) = N \tilde{F}(Y_k(t))$.

In defining linear rank statistics one transforms the rank to a number in the open unit
interval, usually $R_k(t)/(N + 1)$. We recommend $(R_k(t) - .5)/N$. These concepts assume
all observations are distinct, and treat ties by using average ranks. We recommend an
approach which we call the "mid-rank transform", which transforms $Y_k(t)$ to $\tilde{P}(Y_k(t))$,
defining the mid-distribution function of the pooled sample $Y$ by

$$\tilde{P}(y) = \tilde{F}(y) - .5\,\tilde{p}(y).$$

We call

$$\tilde{p}(y) = \text{fraction equal to } y \text{ among pooled sample}$$

the pooled sample probability mass function.
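
The mid-rank transform can be sketched in a few lines of Python. This is a minimal illustration, not the author's software: the function name `mid_distribution` and the small data set are hypothetical. It computes the mid-distribution function as the pooled distribution function minus half the pooled probability mass, which for a distinct value of rank R reduces to (R - .5)/N.

```python
from collections import Counter

def mid_distribution(pooled):
    """Map each distinct pooled value y to P(y) = F(y) - 0.5 * p(y)."""
    n = len(pooled)
    counts = Counter(pooled)
    below = 0  # number of observations strictly less than the current value
    mid = {}
    for y in sorted(counts):
        # F(y) - 0.5 p(y) = (below + count)/n - 0.5 * count/n
        mid[y] = (below + 0.5 * counts[y]) / n
        below += counts[y]
    return mid

pooled = [3, 1, 2, 2, 5]            # hypothetical pooled sample with one tie
P = mid_distribution(pooled)
mid_ranks = [P[y] for y in pooled]  # mid-rank transform of each observation
print(P, sum(mid_ranks) / len(mid_ranks))
```

The mid-ranks of a pooled sample always average to .5, with or without ties, which is why .5 appears as the centering constant in the rank statistics.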

5. SAMPLE MEANS AND VARIANCES. When the random variables are assumed
to be normal the test statistics are based on the sample means (for $k = 1, \ldots, c$)

$$\tilde{Y}_k = \tilde{E}[Y_k] = (1/n_k) \sum_{t=1}^{n_k} Y_k(t).$$

We interpret $\tilde{Y}_k$ as the sample conditional mean of $Y$ given that it comes from the $k$th
population. The unconditional sample mean of $Y$ is

$$\tilde{Y} = \tilde{E}[Y] = p_{\cdot 1} \tilde{Y}_1 + \cdots + p_{\cdot c} \tilde{Y}_c,$$

defining

$$p_{\cdot k} = n_k / N$$

to be the fraction of the pooled sample in the $k$th sample; we interpret it as the empirical
probability that an observation comes from the $k$th sample.

The unconditional and conditional variances are denoted

$$\widetilde{\mathrm{VAR}}[Y] = (1/N) \sum_{k=1}^{c} \sum_{j=1}^{n_k} \{Y_k(j) - \tilde{Y}\}^2,$$

$$\widetilde{\mathrm{VAR}}[Y_k] = (1/n_k) \sum_{j=1}^{n_k} \{Y_k(j) - \tilde{Y}_k\}^2.$$

Note that our divisor is the sample size $N$ or $n_k$ rather than $N - c$ or $n_k - 1$. The latter
then arise as factors used to define F statistics.

We define the pooled variance to be the mean conditional variance:

$$\tilde{\sigma}^2 = \sum_{k=1}^{c} p_{\cdot k} \widetilde{\mathrm{VAR}}[Y_k].$$
6. TWO SAMPLE NORMAL T TEST. In the two sample case the statistic to test
$H_0$ is usually stated in a form equivalent to

$$T = \{\tilde{Y}_1 - \tilde{Y}_2\} / \tilde{\sigma} \{(N/(N - 2))((1/n_1) + (1/n_2))\}^{.5}.$$

We believe that one obtains maximum insight (and analogies and extensions) by expressing
$T$ in the form which compares $\tilde{Y}_1$ with $\tilde{Y}$:

$$T = \{(N - 2)\, p_{\cdot 1} / (1 - p_{\cdot 1})\}^{.5} \{\tilde{Y}_1 - \tilde{Y}\} / \tilde{\sigma}.$$

The exact distribution of $T$ is $t(N - 2)$, the t-distribution with $N - 2$ degrees of freedom.
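
The equivalence of the two forms of $T$ can be checked numerically. The sketch below is illustrative only: the function names and the two small samples are hypothetical, and the pooled standard deviation uses divisor $N$, as in the definition of the pooled variance as the mean conditional variance.

```python
import math

def summary(samples):
    """Return N, the sample means, the pooled mean, and the pooled SD (divisor N)."""
    N = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand = sum(sum(s) for s in samples) / N
    # mean conditional variance: sum_k p_k * VAR[Y_k] = (1/N) * within sum of squares
    var_pooled = sum((y - m) ** 2 for s, m in zip(samples, means) for y in s) / N
    return N, means, grand, math.sqrt(var_pooled)

def t_classical(y1, y2):
    """Form comparing the two sample means with each other."""
    N, (m1, m2), _, sd = summary([y1, y2])
    return (m1 - m2) / (sd * math.sqrt(N / (N - 2) * (1 / len(y1) + 1 / len(y2))))

def t_vs_pooled(y1, y2):
    """Form comparing the first sample mean with the pooled mean."""
    N, (m1, _), grand, sd = summary([y1, y2])
    p1 = len(y1) / N
    return math.sqrt((N - 2) * p1 / (1 - p1)) * (m1 - grand) / sd

y1 = [4.1, 5.0, 6.2, 5.5]          # hypothetical data
y2 = [6.8, 7.4, 5.9, 8.1, 7.7]
print(t_classical(y1, y2), t_vs_pooled(y1, y2))
```

The two values agree up to floating-point rounding because the difference between the first sample mean and the pooled mean equals $(1 - p_{\cdot 1})$ times the difference of the two sample means.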

7. TWO-SAMPLE NONPARAMETRIC WILCOXON TEST. To define the popular
Wilcoxon nonparametric statistic to test $H_0$ we define $W_k$ to be the sum of the $n_k$ ranks
of the $Y_k$ values; its mean and variance are given by

$$E[W_k] = n_k (N + 1)/2, \quad \mathrm{VAR}[W_k] = n_1 n_2 (N + 1)/12.$$

The usual definition of the Wilcoxon test statistic is

$$T_k = \{W_k - E[W_k]\} / \{\mathrm{VAR}[W_k]\}^{.5}.$$

The approach we describe in this paper yields as the definition of the nonparametric
Wilcoxon test statistic (which can be verified to approximately equal the above definition
of $T_1$, up to a factor $\{1 - (1/N)^2\}^{.5}$)

$$T_1 = \{12 (N - 1)\, p_{\cdot 1} / (1 - p_{\cdot 1})\}^{.5} \{\tilde{R}_1 - .5\},$$

defining

$$\tilde{R}_k = (1/n_k) \sum_{t=1}^{n_k} (R_k(t) - .5)/N = W_k / (n_k N) - 1/(2N).$$

One reason we prefer this form of expressing nonparametric statistics is because of its
relation to mid-ranks:

$$\tilde{R}_k = \tilde{E}[\tilde{P}(Y_k)].$$

One should notice the analogy between our expressions for the parametric test statistic
$T$ and the nonparametric test statistic $T_1$; the former has an exact $t(N - 2)$ distribution
and the latter has asymptotic distribution Normal(0, 1).
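
A short numerical check of the relation between the two Wilcoxon forms follows. It is a sketch under stated assumptions: all pooled values are distinct, and the function names and data are hypothetical.

```python
import math

def pooled_ranks(y1, y2):
    """Ranks of the first sample in the pooled sample (distinct values assumed)."""
    pooled = sorted(y1 + y2)
    rank = {y: i + 1 for i, y in enumerate(pooled)}
    return [rank[y] for y in y1], len(pooled)

def wilcoxon_classical(y1, y2):
    """Standardized rank sum: (W - E[W]) / sqrt(VAR[W])."""
    r1, N = pooled_ranks(y1, y2)
    n1, n2 = len(y1), len(y2)
    return (sum(r1) - n1 * (N + 1) / 2) / math.sqrt(n1 * n2 * (N + 1) / 12)

def wilcoxon_component(y1, y2):
    """Form based on the mean of the transformed ranks (R - .5)/N."""
    r1, N = pooled_ranks(y1, y2)
    p1 = len(y1) / N
    r_bar = sum((r - 0.5) / N for r in r1) / len(r1)
    return math.sqrt(12 * (N - 1) * p1 / (1 - p1)) * (r_bar - 0.5)

y1 = [4.1, 5.0, 6.2, 5.5]          # hypothetical data
y2 = [6.8, 7.4, 5.9, 8.1, 7.7]
N = len(y1) + len(y2)
ratio = wilcoxon_component(y1, y2) / wilcoxon_classical(y1, y2)
print(ratio, math.sqrt(1 - 1 / N ** 2))
```

The ratio of the two statistics equals $\{1 - (1/N)^2\}^{.5}$, so the two forms are numerically indistinguishable for moderate $N$.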


8. TEST OF EQUALITY OF c SAMPLES NORMAL CASE. The homogeneity of
$c$ samples is tested in the parametric normal case by the analysis of variance which starts
with a fundamental identity which in our notation is written

$$\widetilde{\mathrm{VAR}}[Y] = \sum_{k=1}^{c} p_{\cdot k} \{\tilde{Y}_k - \tilde{Y}\}^2 + \tilde{\sigma}^2.$$

The F test of the one-way analysis of variance can be expressed as the statistic

$$T^2 = \sum_{k=1}^{c} (1 - p_{\cdot k}) |TF_k|^2 = (N - c) \sum_{k=1}^{c} p_{\cdot k} \{\tilde{Y}_k - \tilde{Y}\}^2 / \tilde{\sigma}^2,$$

defining

$$TF_k = \{(N - c)\, p_{\cdot k} / (1 - p_{\cdot k})\}^{.5} \{\tilde{Y}_k - \tilde{Y}\} / \tilde{\sigma}.$$

The asymptotic distributions of $T^2/(c - 1)$ and $TF_k^2$ are $F(c - 1, N - c)$ and $F(1, N - c)$
respectively.
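
The variance identity and the relation of $T^2/(c - 1)$ to the usual F ratio can be verified on a small example. The sketch below is illustrative; the function name `anova_components` and the three samples are hypothetical.

```python
import math

def anova_components(samples):
    """Return (total variance, between term, pooled variance, [TF_k], T^2)."""
    N = sum(len(s) for s in samples)
    c = len(samples)
    p = [len(s) / N for s in samples]
    means = [sum(s) / len(s) for s in samples]
    grand = sum(pk * mk for pk, mk in zip(p, means))
    var_total = sum((y - grand) ** 2 for s in samples for y in s) / N
    var_pooled = sum((y - m) ** 2 for s, m in zip(samples, means) for y in s) / N
    between = sum(pk * (mk - grand) ** 2 for pk, mk in zip(p, means))
    sd = math.sqrt(var_pooled)
    # components: each sample mean compared with the pooled mean
    tf = [math.sqrt((N - c) * pk / (1 - pk)) * (mk - grand) / sd
          for pk, mk in zip(p, means)]
    t2 = sum((1 - pk) * t ** 2 for pk, t in zip(p, tf))
    return var_total, between, var_pooled, tf, t2

samples = [[4.1, 5.0, 6.2, 5.5],           # hypothetical data
           [6.8, 7.4, 5.9, 8.1, 7.7],
           [5.1, 4.4, 6.0]]
var_total, between, var_pooled, tf, t2 = anova_components(samples)
print(var_total, between + var_pooled, tf, t2)
```

Printing the components `tf` alongside `t2` shows which samples contribute most to the overall statistic, which is the use of components advocated in this paper.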

9. TEST OF EQUALITY OF c SAMPLES NONPARAMETRIC KRUSKAL-WALLIS
TEST. The Kruskal-Wallis nonparametric test of homogeneity of $c$ samples
can be shown to be

$$TKW^2 = \sum_{k=1}^{c} (1 - p_{\cdot k}) |TKW_k|^2,$$

$$TKW_k = \{12 (N - 1)\, p_{\cdot k} / (1 - p_{\cdot k})\}^{.5} \{\tilde{R}_k - .5\}.$$

The asymptotic distributions of $TKW^2$ and $TKW_k^2$ are chi-squared with $c - 1$ and 1
degrees of freedom respectively.
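
The component form can be compared with the textbook Kruskal-Wallis statistic $H = \{12/(N(N+1))\} \sum_k n_k \{\bar{R}_k - (N+1)/2\}^2$, where $\bar{R}_k$ is the mean rank of sample $k$. In this sketch the function names and data are hypothetical and distinct pooled values are assumed; the two statistics differ only by the factor $(N^2 - 1)/N^2$.

```python
import math

def rank_map(samples):
    """Rank of each value in the pooled sample (distinct values assumed)."""
    pooled = sorted(y for s in samples for y in s)
    return {y: i + 1 for i, y in enumerate(pooled)}, len(pooled)

def kruskal_wallis_components(samples):
    """Return ([TKW_k], TKW^2) using transformed ranks (R - .5)/N."""
    rank, N = rank_map(samples)
    comps, total = [], 0.0
    for s in samples:
        pk = len(s) / N
        r_bar = sum((rank[y] - 0.5) / N for y in s) / len(s)
        t = math.sqrt(12 * (N - 1) * pk / (1 - pk)) * (r_bar - 0.5)
        comps.append(t)
        total += (1 - pk) * t ** 2
    return comps, total

def kruskal_wallis_classical(samples):
    """Textbook H statistic without tie correction."""
    rank, N = rank_map(samples)
    return 12 / (N * (N + 1)) * sum(
        len(s) * (sum(rank[y] for y in s) / len(s) - (N + 1) / 2) ** 2
        for s in samples)

samples = [[4.1, 5.0, 6.2, 5.5],           # hypothetical data
           [6.8, 7.4, 5.9, 8.1, 7.7],
           [5.1, 4.4, 6.0]]
comps, tkw2 = kruskal_wallis_components(samples)
H = kruskal_wallis_classical(samples)
print(comps, tkw2, H)
```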

10. COMPONENTS. We have represented the analysis of variance test statistic $T^2$
and the Kruskal-Wallis test statistic $TKW^2$ as weighted sums of squares of statistics $TF_k$
and $TKW_k$ respectively which we call components, since their values should be explicitly
calculated to indicate the source of the significance (if any) of the overall statistics. Other
test statistics that can be defined can be shown to correspond to other definitions of
components.

11. ANDERSON DARLING AND CRAMER VON MISES TEST STATISTICS. Important
among the many test statistics which have been defined to test the equality of
distributions are the Anderson-Darling and Cramer-von Mises test statistics. They will
be introduced below in terms of representations as weighted sums of squares of suitable
components.

12. COMPARISON DISTRIBUTION FUNCTIONS AND COMPARISON DENSITY
FUNCTIONS. We now introduce the key concepts which enable us to unify and
choose between the diverse statistics available for comparing several samples. To compare
two continuous distributions $F(\cdot)$ and $H(\cdot)$, where $H$ is a true or smooth distribution
and $F$ is a model or raw distribution, we define the comparison distribution function

$$D(u) = D(u; H, F) = F(H^{-1}(u))$$

with comparison density

$$d(u) = d(u; H, F) = D'(u) = f(H^{-1}(u)) / h(H^{-1}(u)).$$

Under $H_0: H = F$, $D(u) = u$ and $d(u) = 1$. Thus testing $H_0$ is equivalent to testing
$D(u)$ for uniformity.
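
As a worked illustration (an example added here for concreteness, not one taken from the data of this paper), suppose $H$ is the exponential distribution with rate 1 and $F$ is exponential with rate $\lambda$. Then $H^{-1}(u) = -\log(1 - u)$ and

$$D(u) = F(H^{-1}(u)) = 1 - \exp\{\lambda \log(1 - u)\} = 1 - (1 - u)^{\lambda},$$

$$d(u) = \frac{f(H^{-1}(u))}{h(H^{-1}(u))} = \frac{\lambda (1 - u)^{\lambda}}{1 - u} = \lambda (1 - u)^{\lambda - 1}.$$

When $\lambda = 1$ these reduce to $D(u) = u$ and $d(u) = 1$, the uniform case. When $\lambda > 1$ the density $d(u)$ is largest near $u = 0$, reflecting that $F$ places more mass on small values than $H$ does.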

Sample distribution functions are discrete. The most novel part of this paper is that
we propose to form an estimator $\tilde{D}(u)$ from estimators $\tilde{H}(\cdot)$ and $\tilde{F}(\cdot)$ by using a general
definition of $D(\cdot)$ for two discrete distributions $H(\cdot)$ and $F(\cdot)$ with respective probability
mass functions $p_H$ and $p_F$ satisfying the condition that the values at which $p_H$ is positive
include all the values at which $p_F$ is positive.
13. COMPARISON OF DISCRETE DISTRIBUTIONS. To compare two discrete
distributions we define first $d(u)$ and then $D(u)$ as follows:

$$d(u) = d(u; H, F) = p_F(H^{-1}(u)) / p_H(H^{-1}(u)),$$

$$D(u) = \int_0^u d(t)\,dt.$$

We apply this definition to the discrete sample distributions $\tilde{F}$ and $\tilde{F}_k$ to obtain

$$\tilde{d}_k(u) = d(u; \tilde{F}, \tilde{F}_k)$$

and its integral $\tilde{D}_k(u)$.

We obtain the following definition of $\tilde{d}_k(u)$ for the $c$ sample testing problem with all
values distinct:

$$\tilde{d}_k(u) = \begin{cases} N/n_k & \text{if } (R_k(j) - 1)/N < u \le R_k(j)/N, \ j = 1, \ldots, n_k, \\ 0 & \text{otherwise.} \end{cases}$$

A component, with score function $J(u)$, is a linear functional

$$\tilde{T}_k(J) = \int_0^1 J(u)\, \tilde{d}_k(u)\,du.$$

It equals

$$(1/n_k) \sum_{j=1}^{n_k} N \int_{(R_k(j) - 1)/N}^{R_k(j)/N} J(u)\,du,$$

which can be approximated by $\tilde{E}[J(\tilde{P}(Y_k))]$.
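
The sample comparison density and the two ways of computing a component can be sketched as follows. The function names, the ranks, and the choice $J(u) = u^2$ are hypothetical; distinct pooled values are assumed, so that the mid-distribution value of an observation of rank R is (R - .5)/N.

```python
def comparison_density(u, ranks_k, N):
    """Sample comparison density: N/n_k on each rank interval, 0 elsewhere."""
    n_k = len(ranks_k)
    return N / n_k if any((r - 1) / N < u <= r / N for r in ranks_k) else 0.0

def component_exact(ranks_k, N, J_antiderivative):
    """Integral of J(u) d_k(u) du as a sum of integrals over rank intervals."""
    return sum(N * (J_antiderivative(r / N) - J_antiderivative((r - 1) / N))
               for r in ranks_k) / len(ranks_k)

def component_midrank(ranks_k, N, J):
    """Approximation: sample mean of J at the transformed ranks (R - .5)/N."""
    return sum(J((r - 0.5) / N) for r in ranks_k) / len(ranks_k)

ranks_k, N = [2, 5, 7], 8                    # hypothetical ranks of sample k
exact = component_exact(ranks_k, N, lambda u: u ** 3 / 3)   # J(u) = u^2
approx = component_midrank(ranks_k, N, lambda u: u ** 2)
print(comparison_density(0.2, ranks_k, N), exact, approx)
```

For a linear score function the two computations coincide; for $J(u) = u^2$ they differ by exactly $1/(12N^2)$, which indicates how close the approximation is for smooth score functions.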

14. LINEAR RANK STATISTICS. The concept of a linear rank statistic to compare
the equality of $c$ samples does not have a universally accepted definition. One possible
definition is

$$\tilde{T}_k(J) = (1/n_k) \sum_{j=1}^{n_k} J((R_k(j) - .5)/N).$$

However, we choose the definition of a linear rank statistic as a linear functional of $\tilde{d}_k(u)$,
which we call a component; it is approximately equal to the above formula.

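We define

$$T_k(J) = \left\{ \frac{(N - 1)\, p_{\cdot k}}{(1 - p_{\cdot k})\, \mathrm{VAR}[J(U)]} \right\}^{.5} \int_0^1 J(u) \{\tilde{d}_k(u) - 1\}\,du \qquad (!)$$

where $U$ is Uniform(0, 1), $E[J(U)] = \int_0^1 J(u)\,du$,

$$\mathrm{VAR}[J(U)] = \int_0^1 \{J(u) - E[J(U)]\}^2\,du.$$

When all values are distinct, the integral of $J(u)\tilde{d}_k(u)$ is a sum over $j = 1, \ldots, n_k$ of
integrals over the rank intervals, each multiplied by $N/n_k$:

$$\int_{(R_k(j) - 1)/N}^{R_k(j)/N} J(u)\,du.$$

Note that the integral in the definition of $T_k(J)$ equals

$$\int_0^1 J(u)\,d\{\tilde{D}_k(u) - u\}.$$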
The components of the Kruskal-Wallis nonparametric test statistic $TKW^2$ for testing
the equality of $c$ means have score function $J(u) = u - .5$ satisfying

$$E[J(U)] = 0, \quad \mathrm{VAR}[J(U)] = 1/12.$$

The components of the F test statistic $T^2$ have score function

$$J(u) = \{\tilde{Q}(u) - \tilde{Y}\} / \tilde{\sigma}$$

where $\tilde{Q}(u)$ is the sample quantile function of the pooled sample $Y$.
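
The following sketch computes a normalized linear rank statistic for a general score function and checks that the Wilcoxon score $J(u) = u - .5$ reproduces the Kruskal-Wallis component. It assumes the normalizing constant $(N - 1)p_{\cdot k}/\{(1 - p_{\cdot k})\mathrm{VAR}[J(U)]\}$, which is the form that reduces to $12(N - 1)p_{\cdot k}/(1 - p_{\cdot k})$ when $\mathrm{VAR}[J(U)] = 1/12$; the function name and the ranks are hypothetical, and distinct pooled values are assumed.

```python
import math

def linear_rank_statistic(ranks_k, N, J_antiderivative, mean_J, var_J):
    """Normalized component for score function J, given an antiderivative of J."""
    n_k = len(ranks_k)
    p_k = n_k / N
    # integral of J(u) d_k(u) du: d_k equals N/n_k on each ((R-1)/N, R/N]
    raw = sum(N * (J_antiderivative(r / N) - J_antiderivative((r - 1) / N))
              for r in ranks_k) / n_k
    # subtracting E[J(U)] gives the integral of J(u) {d_k(u) - 1} du
    return math.sqrt((N - 1) * p_k / ((1 - p_k) * var_J)) * (raw - mean_J)

ranks_k, N = [2, 5, 7], 8                    # hypothetical ranks of sample k
# Wilcoxon score J(u) = u - .5, antiderivative u^2/2 - u/2, mean 0, variance 1/12
t_wilcoxon = linear_rank_statistic(ranks_k, N, lambda u: u * u / 2 - u / 2, 0.0, 1 / 12)

# direct Kruskal-Wallis component for comparison
p_k = len(ranks_k) / N
r_bar = sum((r - 0.5) / N for r in ranks_k) / len(ranks_k)
t_kw = math.sqrt(12 * (N - 1) * p_k / (1 - p_k)) * (r_bar - 0.5)
print(t_wilcoxon, t_kw)
```

Other score functions only require a different antiderivative, mean, and variance, which is what makes the component formulation convenient for comparing several rank statistics on the same data.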

15. GENERAL DISTANCE MEASURES. General measures of the distance of $\tilde{D}(u)$
from $u$ and of $\tilde{d}(u)$ from 1 are provided by the integrals from 0 to 1 of

$$\{\hat{d}(u) - 1\}^2, \quad \{\tilde{D}(u) - u\}^2 / u(1 - u),$$

where $\hat{d}(u)$ is a smooth version of $\tilde{d}(u)$. We will see that these measures can be
decomposed into components which may provide more insight; recall basic components are linear
functionals defined by (!)

$$\tilde{T}(J) = \int_0^1 J(u)\, \tilde{d}(u)\,du.$$

If $\phi_i(u)$, $i = 0, 1, 2, \ldots$, are complete orthonormal functions with $\phi_0 = 1$, then $H_0$ can
be tested by diagnosing the rate of increase (as a function of $m = 1, 2, \ldots$) of

$$\int_0^1 \{\hat{d}_m(u) - 1\}^2\,du = \sum_{i=1}^{m} |\tilde{T}(\phi_i)|^2,$$

which measure the distance from 1 of the approximating smooth densities

$$\hat{d}_m(u) = 1 + \sum_{i=1}^{m} \tilde{T}(\phi_i)\, \phi_i(u).$$
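
The relation between the sum of squared components and the distance of the smooth density from 1 can be checked numerically. This sketch assumes the cosine functions $2^{.5}\cos(i \pi u)$ as one example of an orthonormal system; the function names, the ranks, and the grid size are hypothetical, and distinct pooled values are assumed.

```python
import math

def cosine_component(ranks_k, N, i):
    """Integral of sqrt(2) cos(i pi u) d_k(u) du, evaluated exactly."""
    def anti(u):  # antiderivative of sqrt(2) cos(i pi u)
        return math.sqrt(2) * math.sin(i * math.pi * u) / (i * math.pi)
    return sum(N * (anti(r / N) - anti((r - 1) / N)) for r in ranks_k) / len(ranks_k)

def smooth_density(u, comps):
    """Orthogonal series density: 1 + sum_i T(phi_i) phi_i(u)."""
    return 1 + sum(t * math.sqrt(2) * math.cos((i + 1) * math.pi * u)
                   for i, t in enumerate(comps))

ranks_k, N, m = [2, 5, 7], 8, 4              # hypothetical ranks, order m = 4
comps = [cosine_component(ranks_k, N, i) for i in range(1, m + 1)]

# midpoint-rule approximation of the integral of {d_m(u) - 1}^2 over (0, 1)
grid = 2000
distance = sum((smooth_density((j + 0.5) / grid, comps) - 1) ** 2
               for j in range(grid)) / grid
print(distance, sum(t * t for t in comps))
```

Computing `comps` for increasing `m` and watching the running sum of squares is one way to diagnose the rate of increase described above.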

16. ORTHOGONAL POLYNOMIAL COMPONENTS. Let $p_i(x)$ be Legendre
polynomials on $(-1, 1)$:

$$p_1(x) = x,$$

$$p_2(x) = (3x^2 - 1)/2,$$

$$p_3(x) = (5x^3 - 3x)/2,$$

$$p_4(x) = (35x^4 - 30x^2 + 3)/8.$$

Define Legendre polynomial score functions

$$\phi L_i(u) = (2i + 1)^{.5}\, p_i(2u - 1).$$

One can show that an Anderson-Darling type statistic, denoted $AD(\tilde{D})$, can be
represented

$$AD(\tilde{D}) = \int_0^1 \{\{\tilde{D}(u) - u\}^2 / u(1 - u)\}\,du = \sum_{i=1}^{\infty} |\tilde{T}(\phi L_i)|^2 / (i(i + 1)).$$
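
A quick numerical check that the Legendre score functions are orthonormal on (0, 1) is sketched below, using the standard Legendre polynomials of degree 0 to 4 and a midpoint rule. The function names and the grid size are hypothetical choices.

```python
import math

# standard Legendre polynomials on (-1, 1), degrees 0 to 4
LEGENDRE = {
    0: lambda x: 1.0,
    1: lambda x: x,
    2: lambda x: (3 * x ** 2 - 1) / 2,
    3: lambda x: (5 * x ** 3 - 3 * x) / 2,
    4: lambda x: (35 * x ** 4 - 30 * x ** 2 + 3) / 8,
}

def legendre_score(i, u):
    """Score function (2i + 1)^.5 * p_i(2u - 1) on the unit interval."""
    return math.sqrt(2 * i + 1) * LEGENDRE[i](2 * u - 1)

def inner_product(i, j, grid=4000):
    """Midpoint-rule approximation of the integral of phi_i * phi_j over (0, 1)."""
    return sum(legendre_score(i, (t + 0.5) / grid) * legendre_score(j, (t + 0.5) / grid)
               for t in range(grid)) / grid

gram = [[inner_product(i, j) for j in range(5)] for i in range(5)]
for row in gram:
    print([round(v, 4) for v in row])
```

The resulting matrix is the identity up to quadrature error, which is what allows squared components to be added into chi-square type statistics.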
Define cosine score functions by

$$\phi C_i(u) = 2^{.5} \cos(i \pi u).$$

One can show that a Cramer-von Mises type statistic, denoted $CM(\tilde{D})$, can be
represented

$$CM(\tilde{D}) = \int_0^1 \{\tilde{D}(u) - u\}^2\,du = \sum_{i=1}^{\infty} |\tilde{T}(\phi C_i)|^2 / (i \pi)^2.$$

In addition to Legendre polynomial and cosine components we consider Hermite
polynomial components corresponding to Hermite polynomial score functions

$$\phi H_i(u) = (i!)^{-.5}\, H_i(\Phi^{-1}(u)),$$

where $\Phi^{-1}$ is the standard normal quantile function and $H_i(x)$ are the Hermite polynomials:

$$H_1(x) = x,$$

$$H_2(x) = x^2 - 1,$$

$$H_3(x) = x^3 - 3x,$$

$$H_4(x) = x^4 - 6x^2 + 3.$$
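
The orthogonality of the Hermite polynomials under the standard normal distribution, and the normalization by $(i!)^{-.5}$ that makes the corresponding score functions orthonormal, can be verified exactly with integer arithmetic. This sketch assumes the score functions are evaluated at a standard normal quantile, so that expectations over Uniform(0, 1) become normal moments; the names are hypothetical.

```python
from math import factorial

# coefficients of H_1..H_4, lowest degree first
HERMITE = {
    1: [0, 1],
    2: [-1, 0, 1],
    3: [0, -3, 0, 1],
    4: [3, 0, -6, 0, 1],
}

def normal_moment(k):
    """E[Z^k] for standard normal Z: 0 for odd k, (k-1)(k-3)...1 for even k."""
    if k % 2:
        return 0
    m = 1
    for j in range(1, k, 2):
        m *= j
    return m

def inner(a, b):
    """E[a(Z) b(Z)] for polynomials given by coefficient lists."""
    return sum(ca * cb * normal_moment(i + j)
               for i, ca in enumerate(a) for j, cb in enumerate(b))

table = {(i, j): inner(HERMITE[i], HERMITE[j]) for i in HERMITE for j in HERMITE}
print(table)
```

Each diagonal entry equals $i!$ and each off-diagonal entry is 0, so dividing $H_i$ by $(i!)^{.5}$ gives unit variance.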

17. QUARTILE COMPONENTS AND CHI-SQUARE. Quartile diagnostics of the
null hypothesis $H_0$ are provided by components with quartile "square wave" score functions

$$SQ_1(u) = \begin{cases} -2^{.5}, & 0 < u < .25, \\ 0, & .25 < u < .75, \\ 2^{.5}, & .75 < u < 1; \end{cases}$$

$$SQ_2(u) = \begin{cases} 1, & 0 < u < .25, \\ -1, & .25 < u < .75, \\ 1, & .75 < u < 1; \end{cases}$$

$$SQ_3(u) = \begin{cases} 0, & 0 < u < .25 \text{ or } .75 < u < 1, \\ -2^{.5}, & .25 < u < .5, \\ 2^{.5}, & .5 < u < .75. \end{cases}$$

A chi-squared portmanteau statistic, which is asymptotically chi-squared(3), is

$$CQ_k = \{(N - 1)\, p_{\cdot k} / (1 - p_{\cdot k})\} \sum_{i=1}^{3} |\tilde{T}_k(SQ_i)|^2$$

$$= \{(N - 1)\, p_{\cdot k} / (1 - p_{\cdot k})\} \int_0^1 \{dQ_k(u) - 1\}^2\,du,$$

defining the quartile density (for $i = 1, 2, 3, 4$)

$$dQ_k(u) = 4\{\tilde{D}_k(i(.25)) - \tilde{D}_k((i - 1)(.25))\}, \quad (i - 1)(.25) < u < i(.25).$$
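
The equality between the sum of the three squared quartile components and the integral of the squared deviation of the quartile density from 1 can be checked on a small example. The function names and ranks below are hypothetical, and distinct pooled values are assumed.

```python
import math

def comparison_distribution(u, ranks_k, N):
    """Integral from 0 to u of the sample comparison density of sample k."""
    # each rank interval ((R-1)/N, R/N] carries mass 1/n_k, spread uniformly
    return sum(min(max(N * u - (r - 1), 0.0), 1.0) for r in ranks_k) / len(ranks_k)

def quartile_check(ranks_k, N):
    """Return (sum of squared quartile components, integral of (dQ - 1)^2)."""
    D = [comparison_distribution(q / 4, ranks_k, N) for q in range(5)]
    inc = [D[i + 1] - D[i] for i in range(4)]   # mass in each quarter
    dQ = [4 * x for x in inc]                   # quartile density levels
    s2 = math.sqrt(2)
    comps = [s2 * (inc[3] - inc[0]),                 # SQ_1: outer quarters
             inc[0] - inc[1] - inc[2] + inc[3],      # SQ_2: tails vs middle
             s2 * (inc[2] - inc[1])]                 # SQ_3: inner quarters
    return sum(t * t for t in comps), sum(0.25 * (d - 1) ** 2 for d in dQ)

ranks_k, N = [2, 5, 7], 8                       # hypothetical ranks of sample k
sum_sq, integral = quartile_check(ranks_k, N)
print(sum_sq, integral)
```

Multiplying either quantity by $(N - 1)p_{\cdot k}/(1 - p_{\cdot k})$ gives the quartile chi-square for sample $k$.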
A pooled portmanteau chi-squared statistic is

$$CQ = \sum_{k=1}^{c} (1 - p_{\cdot k})\, CQ_k.$$

18. DIVERSE STATISTICS AVAILABLE TO TEST EQUALITY OF c SAMPLES.
The problem of statistical inference is not that we don't have answers to a given question;
usually we have too many answers and we don't know which one to choose. A unified
framework may help determine optimum choices. To compare $c$ samples we can compute
the following functions and statistics:

1) comparison densities $\tilde{d}_k(u)$,

2) comparison distributions $\tilde{D}_k(u)$,

3) quartile comparison density $dQ_k(u)$, quartile density chi-square

$$CQ_k = \{(N - 1)\, p_{\cdot k} / (1 - p_{\cdot k})\} \int_0^1 \{dQ_k(u) - 1\}^2\,du,$$

4) non-parametric regression smoothing of $\tilde{d}_k(u)$ using a boundary Epanechnikov kernel,
denoted $\hat{d}_k(u)$,

5) Legendre components and chi-squares up to order 4 are defined using definition (!) of
$T_k$:

$$TL_k(i) = T_k(\phi L_i),$$

$$CL_k(m) = \sum_{i=1}^{m} |TL_k(i)|^2,$$

$$CL(m) = \sum_{k=1}^{c} (1 - p_{\cdot k})\, CL_k(m),$$

$$AD_k = \sum_{i=1}^{\infty} |TL_k(i)|^2 / (i(i + 1)),$$

$$AD = \sum_{k=1}^{c} (1 - p_{\cdot k})\, AD_k,$$

6) Cosine components and chi-squares up to order 4 are defined:

$$TC_k(i) = T_k(\phi C_i),$$

$$CC_k(m) = \sum_{i=1}^{m} |TC_k(i)|^2,$$

$$CC(m) = \sum_{k=1}^{c} (1 - p_{\cdot k})\, CC_k(m),$$

$$CM_k = \sum_{i=1}^{\infty} |TC_k(i)|^2 / (i \pi)^2,$$

$$CM = \sum_{k=1}^{c} (1 - p_{\cdot k})\, CM_k,$$

7) Hermite components and chi-squares up to order 4 are defined:

$$TH_k(i) = T_k(\phi H_i),$$

$$CH_k(m) = \sum_{i=1}^{m} |TH_k(i)|^2,$$

$$CH(m) = \sum_{k=1}^{c} (1 - p_{\cdot k})\, CH_k(m),$$

8) density estimators $\hat{d}_{k,m}(u)$ computed from components up to order 4,

9) entropy measures with penalty terms which can be used to determine how many
components to use in the above test statistics.

19. EXAMPLES OF DATA ANALYSIS. The interpretation of the diversity of
statistics available is best illustrated by examples.

In order to compare our methods with others available we consider data analysed by
Boos (1986) on ratio of assessed value to sale price of residential property in Fitchburg,
Mass., 1979. The samples (denoted I, II, III, IV) represent dwellings in the categories
single-family, two-family, three-family, four or more families. The sample sizes (54, 43,
31, 28) are proportions .346, .276, .199, .179 of the size 156 of the pooled sample. We
compute Legendre, cosine, Hermite components up to order 4 of the 4 samples; they are
asymptotically standard normal. We consider components greater than 2 (3) in absolute
value to be significant (very significant).
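
The arithmetic in this paragraph is easy to reproduce. The sketch below recomputes the sample proportions from the sample sizes and, as an added illustration, the two-sided standard normal tail probabilities that correspond to the thresholds 2 and 3; the variable and function names are hypothetical.

```python
from statistics import NormalDist

sizes = {"I": 54, "II": 43, "III": 31, "IV": 28}
N = sum(sizes.values())
proportions = {k: round(n / N, 3) for k, n in sizes.items()}

def two_sided_tail(z):
    """P(|Z| > z) for a standard normal Z."""
    return 2 * (1 - NormalDist().cdf(z))

print(N, proportions)
print(two_sided_tail(2), two_sided_tail(3))
```

Under the asymptotic standard normal approximation, a component exceeding 2 in absolute value has a two-sided tail probability of about .046, and one exceeding 3 about .003.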

Legendre, cosine, and Hermite components are very significant only for sample I,
order 1 (-4.06, -4.22, -3.56 respectively). Legendre components are significant for sample
IV, orders 1 and 2 (2.19, 2.31). Cosine components are significant for sample IV, orders 1
and 2 (2.36, 2.23) and sample III, order 1 (2.05). Hermite components are significant for
sample IV, orders 2 and 3 (2.7 and -2.07).

Conclusions are that the four samples are not homogeneous (that is, they do not have the
same distribution). Samples I and IV are significantly different from the pooled sample.
Estimators of the comparison density show that sample I is more likely to have lower values
than the pooled sample, and sample IV is more likely to have higher values. While all the
statistical measures described above have been computed, the insights are provided by the
linear rank statistics of orthogonal polynomials rather than by portmanteau statistics of
Cramer-von Mises or Anderson-Darling type.

20. CONCLUSIONS. The goal of our recent research (see Parzen (1979), (1983))
on unifying statistical methods (especially using quantile function concepts) has been to
help the development of both the theory and practice of statistical data analysis. Our
ultimate aim is to make it easier to apply statistical methods by unifying them in ways
that increase understanding, and thus enable researchers to more easily choose methods
that provide greatest insight for their problem. We believe that if one can think of several
ways of looking at a data analysis one should do so. However, to relate and compare the
answers, and thus arrive at a confident conclusion, a general framework seems to us to be
required.

One of the motivations for this paper was to understand two-sample tests of the
Anderson-Darling type; they are discussed by Pettitt (1976) and Scholz and Stephens
(1987). This paper provides new formulas for these test statistics based on our new
definition of sample comparison density functions. Asymptotic distribution theory for rank
processes defined by Parzen (1983) is given by Aly, Csorgo, and Horvath (1987); an
excellent review of theory for rank processes is given by Shorack and Wellner (1986).
However one can look at k sample Anderson-Darling statistics as a single number
formed from combining many test statistics called components. The importance of
components is also advocated by Boos (1986), Eubank, La Riccia, and Rosenstein (1987) and
Alexander (1989). Insight is greatly increased if instead of basing one's conclusions on
the values of single test statistics, one looks at the components and also at graphs of the
densities of which the components are linear functionals corresponding to various score
functions. The question of which score functions to use can be answered by considering
the tail behavior of the distributions that seem to fit the data.
[Figure: For samples I and IV, sample comparison distribution function $\tilde{D}(u)$.]


